A two-stage speech recognition method for information retrieval applications
Abstract
This paper presents a two-stage approach to speech recognition suited to information retrieval tasks such as accessing a large telephone directory. The first stage performs a Viterbi beam search to decode the speech input into a sequence of phonemes. The second stage performs a graph search to match the phoneme sequence against a large list of keywords. The key point is that the first stage employs a syllable-based language model that does not necessarily depend on the application domain. Experimental results are shown for a telephone directory access task with one million entries.
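The second stage can be illustrated with a minimal sketch. The abstract does not describe the graph search itself, so this example swaps in a plain Levenshtein (edit) distance between the decoded phoneme sequence and each keyword's pronunciation, and ranks keywords by that distance. The lexicon, the phoneme symbols, and the function names `edit_distance` and `match_keywords` are all hypothetical and chosen only for illustration; the first-stage output is assumed to be given.

```python
from typing import Dict, List, Sequence, Tuple


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two phoneme sequences."""
    prev = list(range(len(b) + 1))
    for i, pa in enumerate(a, start=1):
        curr = [i]
        for j, pb in enumerate(b, start=1):
            cost = 0 if pa == pb else 1
            curr.append(min(prev[j] + 1,          # delete a phoneme from a
                            curr[j - 1] + 1,      # insert a phoneme from b
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def match_keywords(decoded: Sequence[str],
                   lexicon: Dict[str, List[str]],
                   n_best: int = 3) -> List[Tuple[str, int]]:
    """Rank keywords by edit distance to the decoded phoneme sequence.

    Stand-in for the paper's graph search: an exhaustive scan, which
    would be too slow for a directory of one million entries.
    """
    scored = sorted((edit_distance(decoded, phones), word)
                    for word, phones in lexicon.items())
    return [(word, dist) for dist, word in scored[:n_best]]


# Hypothetical toy lexicon: keyword -> phoneme sequence.
LEXICON = {
    "smith": ["s", "m", "ih", "th"],
    "smyth": ["s", "m", "ay", "th"],
    "schmidt": ["sh", "m", "ih", "t"],
    "jones": ["jh", "ow", "n", "z"],
}

# Assumed first-stage output containing one phoneme error ("th" heard as "f").
decoded = ["s", "m", "ih", "f"]
print(match_keywords(decoded, LEXICON, n_best=2))
# prints [('smith', 1), ('schmidt', 2)]
```

The sketch shows why the split into two stages is attractive: the phoneme decoder needs no knowledge of the keyword list, and recognition errors are absorbed by approximate matching in the second stage.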
Similar resources
Speech-Driven Text Retrieval: Using Target IR Collections for Statistical Language Model Adaptation in Speech Recognition
Speech recognition has of late become a practical technology for real world applications. Aiming at speech-driven text retrieval, which facilitates retrieving information with spoken queries, we propose a method to integrate speech recognition and retrieval methods. Since users speak contents related to a target collection, we adapt statistical language models used for speech recognition based ...
Spoken Term Detection for Persian News of Islamic Republic of Iran Broadcasting
Islamic Republic of Iran Broadcasting (IRIB), one of the biggest broadcasting organizations, produces thousands of hours of media content daily. Accordingly, IRIB's archive is one of the richest in Iran, containing a huge amount of multimedia data. Monitoring this massive volume of data, and browsing and retrieval of this archive, is one of the key issues for this broadcasting...
Presenting a New Information Retrieval Method Suitable for Texts Produced by Speech Recognition
This article introduces a pre-processing method for retrieving speech-recognized texts. The inputs are a text corpus generated by a speech recognition system and a query; the task is to search these documents for the query and find the relevant ones. A basic problem in typical speech-recognized text is a certain percentage of recognition errors. This results erroneously ass...
Improving Phoneme Sequence Recognition using Phoneme Duration Information in DNN-HSMM
Improving phoneme recognition has attracted the attention of many researchers due to its applications in various fields of speech processing. Recent research shows that using deep neural networks (DNNs) in speech recognition systems significantly improves their performance. DNN-based phoneme recognition systems have two phases: training and testing. Mos...
Segment-based phonetic class detection using minimum verification error (MVE) training
In this paper, we investigate the performance of segment-based detectors for three taxonomic sets of acoustic-phonetic classes. Acoustic-phonetic detectors form an important processing layer for speech event decoding in the new detection-based automatic speech recognition. In this study, detectors are trained within a minimum verification error (MVE) framework which is markedly different from t...